Multiplex short tandem repeat profiling of immortalized hepatic stellate cell line Col-GFP HSC

Misidentification, cross-contamination and genetic drift of continuous animal cell lines are persistent problems in biomedical research, leading to erroneous results and inconsistent or invalidated studies. The establishment of the immortalized hepatic stellate cell line Col-GFP HSC was reported in PLoS One in 2013. In the present study, a multi-locus short tandem repeat signature was established for this cell line that allows its unique authentication.


Introduction
The latest version of the register of misidentified cell lines, published on 8 June 2021 by the International Cell Line Authentication Committee (ICLAC), lists 576 misidentified cell lines [1]. In most cases, this problem results from poor and inattentive handling of cell lines combined with a lack of routine quality control. To mitigate wasted research efforts and false claims in the literature, many biomedical research funding entities, such as the NIH, have announced review criteria to enhance reproducibility of research findings through increased scientific rigor and transparency. In particular, special attention is given to authentication of biological materials such as cell lines [2]. Similarly, several scientific journals have included instructions or requirements for cell line authentication in their author guidelines. PLoS One, for example, notes in its guidelines: "Cell line authentication is recommended - e.g., by karyotyping, isozyme analysis, or short tandem repeats (STR) analysis - and may be required during peer review or after publication" [3].
The currently recommended method for cell authentication is short tandem repeat (STR) profiling [4]. An STR, also known as a microsatellite, is a non-coding, short DNA sequence composed of a variable number of repeats of two to ten nucleotides in length. For human cell lines, STR profiling was identified in 2010 as an important step to eradicate incorrectly identified cell lines [5]. Some years later, efforts were also started to develop strategies for authentication of mouse cell lines. In a landmark paper published by the National Institute of Standards and Technology (NIST), a multiplex PCR assay based on nine STR markers was established that is suitable to authenticate individual mouse cell lines [6]. Several years later, a Consortium for Mouse Cell Line Authentication consisting of 12 participating laboratories was formed, representing institutions from academia, industry, biological resource centers, and government, with the aim to validate an extended STR marker set [7]. In that study, the multiplex PCR panel of mouse STR markers, consisting of 19 loci, was shown to be capable of discriminating at the intra-species level between 50 commonly used mouse cell lines [7].
Nowadays, reputable cell repositories including the American Type Culture Collection (ATCC), German Collection of Microorganisms and Cell Cultures (DSMZ), RIKEN Cell Bank (RCB), and the Japanese Collection of Research Bioresources (JCRB) have established authentication guidelines and maintain cell line databases that include thousands of individual cell lines. A new cell line deposited in one of these repositories undergoes rigorous identity and quality controls, notably authentication by STR profiling, before it is made available to the scientific community. As such, cell banks should be able to provide a certified DNA STR profile for each cell line distributed.
Several years ago, we generated and characterized a novel continuous cell line termed Col-GFP HSC. This cell line was derived from primary hepatic stellate cells (HSC) that were isolated from a transgenic mouse expressing green fluorescent protein (GFP) under control of the collagen α1(I) promoter/enhancer [8]. The cells were immortalized by infection with a lentivirus vector containing the Simian virus 40 large T antigen (SV40T) and a hygromycin resistance cassette [8]. Because these cells are responsive to pro-fibrogenic stimuli, such as PDGF or TGF-β1, and are able to activate intracellular signaling pathways including Smads and MAP kinases, this cell line is a promising tool for investigating specific aspects of fibrogenic signaling. We have received many requests to provide this cell line to other laboratories. However, a reference genetic profile for this cell line had not yet been established.
This study reports an STR profile for this cell line including the electropherogram images that can be used as a unique fingerprint of Col-GFP HSC. The STR profile consists of the 19 species-specific markers that were proposed by the Consortium for Mouse Cell Line Authentication for authentication of mouse cell lines. This newly established reference profile can now be included in subsequent publications or grant applications to fulfill the need for authentication of key biological resources when working with Col-GFP HSC.

Cell culture
Col-GFP HSC cells were routinely propagated in 10 cm Petri dishes and cultured in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal bovine serum (FBS), 1× Penicillin/Streptomycin, 1 mM sodium pyruvate, and 2 mM L-Glutamine. Medium was exchanged every second day, and cells were detached for subculturing using Accutase solution (#A6964, Sigma-Aldrich, Merck, Darmstadt, Germany). For STR profiling, the cells were harvested at 80% density, which corresponds to about 8 × 10^6 cells per 10 cm dish.

Short tandem repeat profiling
STR profiling and interspecies contamination testing for Col-GFP HSC were performed by a commercial service provider (IDEXX, Kornwestheim, Germany). In the respective assay (i.e., the Cell-Check Mouse 19 panel), the cells were genotyped using a panel of 19 mouse STR markers that have been described previously [6,7]. The primers used in this assay for the individual markers are given in Table 1.
This system establishes a genetic microsatellite marker profile and allows the detection of as little as 5% interspecies cross-contamination. Individual markers were analyzed by GeneMapper software 6. The determined STR profile of Col-GFP HSC was deposited in the Cellosaurus database under accession no. CVCL_B7MI. An STR similarity search was performed using the CLASTR 1.4.4 matching tool, which is available through the Cellosaurus STR database (release 41.0) [9]. The search settings were as follows: Scoring algorithm: Tanabe, Mode: Non-empty markers, Score filter: 70%, and Min. Markers: 8. The Tanabe scoring algorithm gauges the similarity of two samples and is defined as: Percent match = (number of shared alleles × 2) / (total number of alleles in the questioned profile + total number of alleles in the reference profile) [10].
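The percent match formula given above can be illustrated with a minimal Python sketch. This is not the CLASTR implementation: the function name, the toy marker names (M1 to M3) and the allele calls are hypothetical, the ratio is multiplied by 100 to express it as a percentage, and the handling of the "Non-empty markers" mode (comparing only markers with allele calls in both profiles) is an assumption made for illustration.

```python
def tanabe_percent_match(query, reference, min_markers=8):
    """Percent match = 2 * shared alleles / (query alleles + reference alleles) * 100."""
    # Assumption: only markers with at least one allele call in BOTH profiles are compared.
    usable = [m for m in query if query[m] and reference.get(m)]
    if len(usable) < max(min_markers, 1):
        raise ValueError(f"only {len(usable)} usable markers, need {min_markers}")
    # Alleles are compared as sets per marker, so a repeated call is counted once.
    shared = sum(len(set(query[m]) & set(reference[m])) for m in usable)
    total = sum(len(set(query[m])) + len(set(reference[m])) for m in usable)
    return 100.0 * 2 * shared / total


# Hypothetical toy profiles (marker names and allele calls are invented).
query = {"M1": {"15"}, "M2": {"12", "13"}, "M3": {"20.3"}}
reference = {"M1": {"15"}, "M2": {"12", "14"}, "M3": {"21.3"}}

score = tanabe_percent_match(query, reference, min_markers=3)
passes_filter = score >= 70.0  # 70% score filter, as in the search settings above
print(f"{score:.1f}% match, passes 70% filter: {passes_filter}")
```

In this toy case the two profiles share two alleles out of four alleles each, giving 2 × 2 / (4 + 4) = 50%, which would fall below the 70% score filter.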

Results and discussion
Cross-contamination and misidentification of mammalian cell cultures are widespread, leading to thousands of misleading and potentially erroneous published papers [5]. In 2017, a conservative estimate found that 32,755 articles reported research results obtained with misidentified cells [11]. Moreover, that study revealed that over 92% of these 'contaminated' papers are cited at least once, spreading potentially misleading information as a 'secondary contamination' to the scientific community. Therefore, establishing a cell line's identity prior to performing experiments is essential to conduct valid and reproducible research.

Marker Forward Primer (5'→3') Reverse Primer (5'→3')
AACAAAAATGTCCCTCAATGC AAGGTATATATCAAGATGGCATTATCA
* Information of primer pair for marker 9-2 was taken from [6], all other primer pairs were taken from [7]. Forward primers are given without fluorescent dyes at their 5' ends and reverse primers are given without the "PIGtail" sequences (a 7-nucleotide tag with sequence GTTTCTT) that are used in this assay to promote complete adenylation. For more details of this multiplex PCR assay please refer to the original publication in which this assay was established [7].

PLOS ONE

The Col-GFP HSC cell line was established nearly ten years ago [8]. The cell line was derived from primary hepatic stellate cells (HSC) isolated from a Col-GFP transgenic reporter mouse model. Immortalization was achieved by infecting the cells with a lentiviral vector containing the SV40T and a hygromycin resistance cassette. As such, the cells express green fluorescent protein, SV40T, and a characteristic set of HSC marker genes including α-smooth muscle actin (α-SMA), vimentin, desmin, and collagen type I [8]. Moreover, the cells are sensitive towards ligands involved in fibrosis and can be indirectly used to monitor the regulation of Collagen IαI expression [12]. Therefore, this cell system is an ideal experimental tool for cell-tracking experiments, co-culture systems or any other kind of studies in which cells of HSC origin should be investigated. However, the mentioned features are not necessarily specific for this cell line. The viral oncogene SV40T is a widely used agent to obtain immortalization or conditional reprogramming in primary cells [13]. Similarly, GFP and its derivatives have gained widespread use as a reliable and easily traceable reporter of gene expression or cellular structures in individual eukaryotic cells [14,15]. Therefore, and to fulfill current requirements and standards for authentication of a cell line, we established a genetic profile for Col-GFP HSC that is based on 19 species-specific STR markers (Fig 1, Table 2).
* The term "known allele range" corresponds to the number of repeats that might exist at the analyzed polymorphic marker sites. The allele range for STR marker 9-2 was taken from [6], all other allele ranges were taken from [7]. https://doi.org/10.1371/journal.pone.0274219.t002

The determined STR profile provides the power to discriminate Col-GFP HSC from other cell lines of mouse origin and is also suitable to detect interspecies contamination [6,7,16]. In the future, this STR profile should be used as a reference for authentication of this line. As such, it is another small step to implement the efforts to enhance reproducibility of research findings through increased scientific rigor and transparency by authenticating key biological materials, including cell lines.
We have not yet performed a karyotype analysis for this cell line, which would be an additional option for cell authentication. Such an analysis has been done previously in our laboratory for two other hepatic stellate cell lines from mouse and rat [17,18]. The numerical and structural chromosomal abnormalities observed in these cell lines were proposed as an alternative strategy for cell authentication. In addition, many other methods for cell authentication are available, including isoenzyme analysis, next generation-based single nucleotide profiling, and DNA fingerprinting, but STR profiling has become accepted as the simplest method to identify cross-contamination and cell misidentification [19].